Update ruby and mmtk-core repo rev #130

Merged 4 commits on Jul 8, 2025

wks (Collaborator) commented Jun 30, 2025

This is a regular merging commit that synchronizes MMTk's CRuby fork
with the upstream.

The upstream introduced the imemo:fields type and now uses it for the
generic_fields_tbl_, i.e. an imemo:fields object now holds the instance
variables of each object other than T_OBJECT, T_CLASS and T_MODULE.
When using MMTk, we treat each key-value pair in the generic_fields_tbl_
as a strong edge, i.e. we treat the imemo:fields of an object as if it
were a child of that object. We now update the generic_fields_tbl_ like
other weak tables. This simplifies the handling of the generic fields
table in the MMTk binding.

With imemo:fields added, we now have 17 imemo types including MMTk's
imemo:mmtk_strbuf and imemo:mmtk_objbuf. We increased the header bits
of the imemo type to 5 bits and changed the value of some imemo-specific
header offsets, such as ISEQ_TRANSLATED.
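The bit-width arithmetic behind this change can be checked with a small sketch. The helper name `bits_needed` is hypothetical and is not part of CRuby or the MMTk binding; it only shows why 17 distinct type tags no longer fit in 4 bits.

```rust
/// Smallest number of bits that can encode `count` distinct values
/// (the tags 0..count). Hypothetical helper, for illustration only.
fn bits_needed(count: u32) -> u32 {
    if count <= 1 {
        0
    } else {
        // The largest tag is `count - 1`; its bit length is the answer.
        u32::BITS - (count - 1).leading_zeros()
    }
}

/// Returns the widths needed for 16 and for 17 type tags.
fn imemo_widths() -> (u32, u32) {
    (bits_needed(16), bits_needed(17))
}
```

Sixteen tags fit in 4 bits, and the 17th tag requires a 5th bit, which is consistent with the 5-bit field described above. Any flag stored right after the type field then has to move up accordingly.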

The upstream changed the API for acquiring and releasing the GVL. We
updated our code accordingly.

YJIT assumes there is only one thread doing GC. It panics when two GC
worker threads try to mark two iseq objects simultaneously. We made
several changes to support parallel GC for YJIT-compiled iseq objects:

  1. We replaced Block::gc_obj_offsets with Block::gc_obj_addresses,
    which holds absolute addresses instead of offsets. By doing so, we
    no longer need to borrow the Rc<RefCell<VirtualMem>> during GC.
  2. rb_yjit_iseq_update_references no longer uses CodeBlock to write
    code, and no longer calls mark_all_executable after updating each
    object. Instead, we make the whole code memory writable before
    updating any objects, and make it executable again after the
    updating phase finishes. This should both make it friendly to
    parallel GC and improve performance.
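The batching idea in item 2 can be sketched as follows. This is a simplified model, not YJIT's code: `CodeMemory`, `Protection`, `Block::gc_obj_slots` and `update_references` are hypothetical names, the "code memory" is a plain byte vector, and byte indices stand in for the absolute addresses so that the sketch needs no `unsafe`.

```rust
/// Simulated protection state of the JIT code memory (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Protection {
    Executable,
    Writable,
}

/// A stand-in for the JIT code region; counts protection changes.
struct CodeMemory {
    bytes: Vec<u8>,
    protection: Protection,
    protection_changes: usize,
}

const WORD: usize = std::mem::size_of::<usize>();

impl CodeMemory {
    fn new(size: usize) -> Self {
        CodeMemory { bytes: vec![0; size], protection: Protection::Executable, protection_changes: 0 }
    }

    fn set_protection(&mut self, p: Protection) {
        if self.protection != p {
            self.protection = p;
            self.protection_changes += 1;
        }
    }

    fn read_slot(&self, at: usize) -> usize {
        let mut buf = [0u8; WORD];
        buf.copy_from_slice(&self.bytes[at..at + WORD]);
        usize::from_ne_bytes(buf)
    }

    fn write_slot(&mut self, at: usize, value: usize) {
        // Writing is only legal while the region is writable.
        assert_eq!(self.protection, Protection::Writable);
        self.bytes[at..at + WORD].copy_from_slice(&value.to_ne_bytes());
    }
}

/// Each block records where object references are embedded in the code.
struct Block {
    gc_obj_slots: Vec<usize>,
}

/// Make the region writable once, update every slot, then make it
/// executable once, instead of toggling protection per object.
fn update_references(mem: &mut CodeMemory, blocks: &[Block], forward: impl Fn(usize) -> usize) {
    mem.set_protection(Protection::Writable);
    for block in blocks {
        for &slot in &block.gc_obj_slots {
            let old = mem.read_slot(slot);
            mem.write_slot(slot, forward(old));
        }
    }
    mem.set_protection(Protection::Executable);
}
```

In this model the number of protection changes stays at two no matter how many blocks are updated, which is the property the description attributes to the new scheme.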

wks added 3 commits June 25, 2025 14:51
The upstream has changed its way to handle the generic fields table.
The generic_fields_tbl_ now maps each object to an imemo:fields, and the
imemo:fields object will hold all the fields.  We now treat the
imemo:fields object as a child of its corresponding key in
generic_fields_tbl_, which is just like CRuby's default GC.  Like the
default GC, we handle generic_fields_tbl_ like other weak tables during
weak processing time.

On the Rust side, we replaced the moved_gen_fields_tables hash map with
a simple "backwarding table" that maps each moved object that has an
entry in generic_fields_tbl_ to its old address.
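A minimal sketch of such a backwarding table is shown below. The type and method names (`BackwardingTable`, `record_move`, `old_address`) are hypothetical and do not come from the binding; addresses are modeled as plain `usize` values.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an object address.
type Address = usize;

/// Maps the new address of each moved object back to its old address.
#[derive(Default)]
struct BackwardingTable {
    new_to_old: HashMap<Address, Address>,
}

impl BackwardingTable {
    /// Record that an object was copied from `old` to `new`.
    fn record_move(&mut self, old: Address, new: Address) {
        self.new_to_old.insert(new, old);
    }

    /// Address the object had before this GC. Objects that were not
    /// moved have no entry and map to themselves.
    fn old_address(&self, current: Address) -> Address {
        self.new_to_old.get(&current).copied().unwrap_or(current)
    }
}

/// Looks up one moved and one unmoved object.
fn demo_backwarding() -> (Address, Address) {
    let mut table = BackwardingTable::default();
    table.record_move(0x1000, 0x2000);
    (table.old_address(0x2000), table.old_address(0x3000))
}
```

One plausible use, stated here as an assumption, is finding a table entry that is still keyed by an object's old address while the table is being updated.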
This is a regular merging commit that synchronizes MMTk's CRuby fork
with the upstream.

The upstream introduced the imemo:fields type, and is now using it for
the `generic_fields_tbl_`, i.e. it now holds instance variables for
objects other than `T_OBJECT`, `T_CLASS` and `T_MODULE`.  When using
MMTk, we treat the key-value pair in the `generic_fields_tbl_` as a
strong edge, i.e. treating the imemo:fields of an object as if it were a
child.  We now update the `generic_fields_tbl_` like other weak tables.
This simplified the handling of generic fields table in the MMTk
binding.

With imemo:fields added, we now have 17 imemo types including MMTk's
imemo:mmtk_strbuf and imemo:mmtk_objbuf.  We increased the header bits
of the imemo type to 5 bits and changed the value of some imemo-specific
header offsets, such as `ISEQ_TRANSLATED`.

The upstream changed the API for acquiring/releasing the GVL.  We make
changes accordingly.

YJIT assumes there is only one thread doing GC.  It panics when two GC
worker threads try to mark two iseq objects simultaneously.  We made
several changes to support parallel GC for YJIT-compiled iseq objects:

1.  We replaced `Block::gc_obj_offsets` with `Block::gc_obj_addresses`
    which are now absolute pointers instead of offsets.  By doing so, we
    no longer need to borrow the `Rc<RefCell<VirtualMem>>` during GC.
2.  `rb_yjit_iseq_update_references` no longer uses `CodeBlock` to write
    code, and no longer calls `mark_all_executable` after updating each
    object.  Instead we make the whole code memory writable before
    updating any objects, and make it executable after the updating
    phase finishes.  This should both make it friendly to parallel GC
    and improve performance.

The upstream now acquires the GVL when adding a freed fiber to the fiber
pool.  Since MMTk worker threads cannot acquire the GVL, and MMTk only
calls `obj_free` in one GC worker thread, we skip the GVL when using
MMTk.
wks force-pushed the update/merge-2025-06-25 branch from 4dd4675 to 32e8dd0 on June 30, 2025 14:47
wks (Collaborator, Author) commented Jul 7, 2025

The Immix test crashed, but I couldn't reproduce the crash. I recorded it in issue #131, and I'll restart the test.

Previously we didn't process the newly added cc_refinement table, which
caused crashes.

We re-organized the order of weak table processing functions, and follow
the order as listed in `rb_gc_vm_weak_table_foreach` in `gc.c`, so that
we don't miss a table.  We also replaced functions that get tables with
functions that simply get table sizes.
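One way to make a missed table harder to overlook is to drive the processing from a single ordered list. The sketch below illustrates that idea only: the `WeakTable` enum, its `ALL` list and the helper functions are hypothetical, and only the generic fields and cc_refinement tables are named in this pull request (the third variant is a placeholder).

```rust
/// Weak tables to process, in one fixed order (hypothetical model).
#[derive(Debug, Clone, Copy, PartialEq)]
enum WeakTable {
    GenericFields,
    CcRefinement,
    /// Placeholder for the remaining tables, not a real table name.
    Other,
}

impl WeakTable {
    /// Single source of truth for the processing order.
    const ALL: [WeakTable; 3] = [WeakTable::GenericFields, WeakTable::CcRefinement, WeakTable::Other];

    /// Exhaustive match: adding a variant without handling it here
    /// is a compile error.
    fn name(self) -> &'static str {
        match self {
            WeakTable::GenericFields => "generic_fields",
            WeakTable::CcRefinement => "cc_refinement",
            WeakTable::Other => "other",
        }
    }
}

/// Visit every table in order, asking only for its size, and return
/// the names of the tables that have entries to process.
fn tables_to_process(size_of: impl Fn(WeakTable) -> usize) -> Vec<&'static str> {
    WeakTable::ALL
        .iter()
        .filter(|&&t| size_of(t) > 0)
        .map(|&t| t.name())
        .collect()
}
```

The exhaustive `match` catches an unhandled variant at compile time, but the `ALL` list still has to be kept in sync by hand, much like following the order in `rb_gc_vm_weak_table_foreach`.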
wks merged commit 221c509 into mmtk:master on Jul 8, 2025
8 checks passed